The Role of Speech Input in Wearable Computing

Author

Thad Starner

Abstract

Speech recognition seems like an attractive input mechanism for wearable computers, and as we saw in this magazine's first issue, several companies are promoting products that use limited speech interfaces for specific tasks. However, we must overcome several challenges to using speech recognition in more general contexts, and interface designers must be wary of applying the technology to situations where speech is inappropriate.

On a tram in Zürich the other day, my research group started to discuss a counter-terrorism scenario I'm writing for the US Defense Advanced Research Projects Agency (DARPA). Going up a hill, the tram became quite noisy, so naturally I adjusted my voice to compensate. The tram stopped, and I suddenly found myself yelling, in very clear English, "Unless, of course, you are an American and get kidnapped by al-Qaeda." All conversation on the tram ceased.

Although amusing, this story is no doubt familiar. Almost everyone has had a similar experience in a crowded restaurant or at a cocktail party. We speak differently in the presence of noise: increased amplitude, reduced word rate, and clearer articulation.[1] The effect is sometimes called Lombard speech, after the French physician who noted it in 1911.

Unfortunately, Lombard speech makes speech recognition on a wearable computer difficult. Most commercial speech recognizers are trained for dictation in an office environment and fail miserably when used in a noisy mobile environment. Although noise-canceling microphones help considerably in reducing the noise level, the speaker's change in voice due to ambient noise means that even speech systems trained specifically to understand the user could have problems.

In a recent talk, Chalapathy Neti from IBM showed results that begin to address the first problem of mobile speech recognition, that of background noise.[2] IBM tested its speech recognition system under conditions in which progressively louder "speech babble" background noise was added to the clean speech to be recognized. The clean speech had a 19.5-decibel signal-to-noise ratio (SNR), and the recognition engine performed at a 12 percent word error rate on this speech. When researchers added noise to reach an SNR of 12 dB, the word error rate increased to 60 percent. By 0 dB, word error approached 100 percent! Training the system on speech that was degraded by the same level and source of noise helped considerably: under 30 percent error at 12 dB and 80 …
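To make the SNR figures above concrete, here is a minimal sketch of how background noise can be scaled so that a recording reaches a chosen signal-to-noise ratio. It is not IBM's test procedure, which the abstract does not describe in detail. The function names (`power`, `snr_db`, `scale_noise_to_snr`), the sine tone standing in for speech, and the Gaussian noise standing in for "speech babble" are all assumptions made for illustration.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(power(signal) / power(noise))


def scale_noise_to_snr(signal, noise, target_snr_db):
    """Return a copy of `noise` scaled so that signal vs. noise has the target SNR."""
    wanted_noise_power = power(signal) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(wanted_noise_power / power(noise))
    return [gain * n for n in noise]


# Hypothetical stand-ins: a 220 Hz tone for speech, Gaussian noise for babble.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
babble = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for target in (19.5, 12.0, 0.0):
    scaled = scale_noise_to_snr(clean, babble, target)
    noisy = [c + n for c, n in zip(clean, scaled)]  # what a recognizer would hear
    print(f"target {target:5.1f} dB -> measured {snr_db(clean, scaled):5.1f} dB")
```

Because decibels are logarithmic, moving from 19.5 dB to 12 dB multiplies the noise power by roughly 5.6, and at 0 dB the noise carries as much power as the speech itself.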

Similar resources

Ubiquitous speech processing

In the ubiquitous (pervasive) computing era, it is expected that everybody will access information services anytime anywhere, and these services are expected to augment various human intelligent activities. Speech recognition technology can play an important role in this era by providing: (a) conversational systems for accessing information services and (b) systems for transcribing, understandi...

The Investigation and Analysis of the Strengths, Weaknesses, Opportunities, and Threats of Wearable Electronic Technologies: A Systematic Review

Introduction: Wearable electronic devices, which are based on Internet of Things (IoT) and big data computing, are able to continuously collect and process the physiological and environmental data and exchange them with other tools, users, and internet networks. Therefore, despite their potential benefits in health monitoring, they can pose serious risks, especially in breach of privacy. Hence...

Role of Wearable Technology in the Diagnosis and Prevention of COVID-19

Nowadays, the role of technology innovations in precaution, early diagnosis, and treatment is not deniable due to the increase in various types of electronics. Additionally, increasing costs of medical services and lack of medical expertise and specialists in remote regions double the necessity of using technology in this field. Wearable technology comprising a large group of worn gadgets, incl...

Touching the Visualized Invisible: Wearable AR with a Multimodal Interface

Wearable computers and their novel applications demand more context-specific user interfaces than traditional desktop paradigms can offer. We describe a multimodal interface to a wearable computing system and explain how it enhances a mobile user’s situational awareness and provides new functionality. Our mobile augmented reality system visualizes otherwise “invisible” information encountered i...

The Effect of Comprehensible Input and Comprehensible Output on the Accuracy and Complexity of Iranian EFL Learners’ Oral Speech

This study aimed at investigating the relative impact of comprehensible input and comprehensible output on the development of grammatical accuracy and syntactic complexity of Iranian EFL learners’ oral production. Participants were 60 female EFL learners selected from a whole population pool of 80 based on the standard test of IELTS. To investigate the research questions, the participants were ...

Journal:

IEEE Pervasive Computing

Volume 1, Issue

Pages -

Publication date: 2002
